fix(workflows): filter user-plugin MCP noise out of workflow warnings #1327
Conversation
Before this change, the dag-executor surfaced every entry in the Claude SDK's "MCP server connection failed: …" system message to the user. That message includes user-level plugin MCPs inherited from ~/.claude/ (e.g. `telegram`) that fail to connect in the headless workflow subprocess — they're non-actionable noise for the workflow author.

Fix:

- Pre-compute the set of workflow-configured MCP server names per node by parsing the `mcp:` config file once at the start of executeNodeInternal. No caller-facing API change; no duplication of the provider's env-var expansion logic (we only need the keys).
- Split the system-message handler: the `MCP server connection failed:` path now surfaces only the subset of failing names that match the node's configured set; user-plugin failures are debug-logged as `dag.mcp_plugin_connection_suppressed`. The `⚠️ ` branch is unchanged.

Supersedes #1134 (closed as stale — the Windows HOME fix in that PR was already shipped via #1302, and the claude.ts enabledPlugins change targeted a file that has since moved into @archon/providers).

Credits @MrFadiAi for identifying and reporting the underlying issue.

Co-authored-by: Fadi Ai <MrFadiAi@users.noreply.github.com>
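The split this commit describes can be sketched roughly as follows. This is an illustrative simplification, not the shipped code: the handler shape and callback signatures are assumed, and it forwards bare server names without the per-server status detail (a point the review below picks up on).

```typescript
// Illustrative sketch of the MCP-failure filtering split. MCP_FAILURE_PREFIX
// and the event name mirror the commit message; the handler shape is assumed.
const MCP_FAILURE_PREFIX = 'MCP server connection failed: ';

function handleSystemMessage(
  content: string,
  configuredMcpNames: Set<string>,
  forward: (msg: string) => void,
  debugLog: (event: string, names: string[]) => void
): void {
  if (!content.startsWith(MCP_FAILURE_PREFIX)) {
    // "⚠️ " provider warnings and other system messages: surface verbatim.
    forward(content);
    return;
  }
  // Parse "name (status), name (status), …" into bare server names.
  const names = content
    .slice(MCP_FAILURE_PREFIX.length)
    .split(', ')
    .map(entry => entry.split(' (')[0].trim())
    .filter(name => name.length > 0);
  const workflowFailures = names.filter(n => configuredMcpNames.has(n));
  const pluginFailures = names.filter(n => !configuredMcpNames.has(n));
  if (workflowFailures.length > 0) {
    // Only workflow-configured servers reach the conversation.
    forward(`${MCP_FAILURE_PREFIX}${workflowFailures.join(', ')}`);
  }
  if (pluginFailures.length > 0) {
    // User-plugin noise is demoted to a debug log entry.
    debugLog('dag.mcp_plugin_connection_suppressed', pluginFailures);
  }
}
```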
No actionable comments were generated in the recent review. 🎉
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 Walkthrough

The PR adds parsing and config-loading helpers and updates the dag executor to load a node's configured MCP server names and filter MCP connection-failure warnings against them.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Runner as Runner/Client
    participant Exec as DagExecutor
    participant FS as FileSystem
    participant SDK as SDK Stream
    participant User as Workflow User
    participant Log as Debug Logger
    Runner->>Exec: executeNode(node)
    Exec->>FS: loadConfiguredMcpServerNames(node.mcp, cwd)
    FS-->>Exec: Set[string] of configured servers
    SDK->>Exec: system message "MCP server connection failed: ..."
    Exec->>Exec: parseMcpFailureServerNames(message)
    alt matches configured servers
        Exec->>User: forward filtered MCP failure message
    else all unconfigured (plugin noise)
        Exec->>Log: dag.mcp_plugin_connection_suppressed (debug)
    end
    SDK->>Exec: system message "⚠️ ..." (provider warning)
    Exec->>User: forward provider warning verbatim
```
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
🧹 Nitpick comments (2)
packages/workflows/src/dag-executor.ts (1)
885-912: Preserve MCP failure details while filtering names.
`filteredMsg` forwards only server names, and the debug log stores only names, so the original failure reason/status is lost for both workflow-configured servers and suppressed plugins. Keep the original entry text per parsed name while still filtering unconfigured servers.

♻️ Proposed diagnostics-preserving filter
```diff
 const failedNames = parseMcpFailureServerNames(msg.content);
+const failureEntryByName = new Map<string, string>();
+for (const entry of msg.content.slice(MCP_FAILURE_PREFIX.length).split(', ')) {
+  const name = entry.split(' (')[0]?.trim();
+  if (name && !failureEntryByName.has(name)) {
+    failureEntryByName.set(name, entry.trim());
+  }
+}
 const workflowFailures = failedNames.filter(n => configuredMcpNames.has(n));
 const pluginFailures = failedNames.filter(n => !configuredMcpNames.has(n));
 if (workflowFailures.length > 0) {
-  const filteredMsg = `${MCP_FAILURE_PREFIX}${workflowFailures.join(', ')}`;
+  const filteredMsg = `${MCP_FAILURE_PREFIX}${workflowFailures
+    .map(n => failureEntryByName.get(n) ?? n)
+    .join(', ')}`;
   getLog().warn(
     { nodeId: node.id, systemContent: filteredMsg },
     'dag.provider_warning_forwarded'
   );
@@
 if (pluginFailures.length > 0) {
   getLog().debug(
-    { nodeId: node.id, pluginFailures },
+    {
+      nodeId: node.id,
+      pluginFailures: pluginFailures.map(n => failureEntryByName.get(n) ?? n),
+    },
     'dag.mcp_plugin_connection_suppressed'
   );
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/workflows/src/dag-executor.ts` around lines 885-912: the current code uses parseMcpFailureServerNames to extract only server names and loses the original failure text; update the filtering so you keep the original parsed entries (name+reason/status) while still splitting them into workflowFailures vs pluginFailures by checking configuredMcpNames. Specifically, change how failedNames is produced and filtered (references: parseMcpFailureServerNames, configuredMcpNames) so filteredMsg (MCP_FAILURE_PREFIX + ...) and the plugin debug log include the full original entry text per server rather than just the bare names, and continue to call safeSendMessage and error/log paths unchanged for workflowFailures and pluginFailures.

packages/workflows/src/dag-executor.test.ts (1)
5761-5853: Add one executor-path regression test for MCP warning filtering.

The helper tests are solid, but they do not prove `executeDagWorkflow` actually forwards configured MCP failures and suppresses plugin failures. A small mocked-provider test would cover the changed branch end-to-end without network or timing dependencies.

🧪 Example deterministic test coverage
```diff
+describe('executeDagWorkflow -- MCP plugin-noise filtering', () => {
+  let testDir: string;
+
+  beforeEach(async () => {
+    testDir = join(tmpdir(), `dag-mcp-filter-${Date.now()}-${Math.random().toString(36).slice(2)}`);
+    await mkdir(testDir, { recursive: true });
+    mockSendQueryDag.mockClear();
+    mockGetAgentProviderDag.mockImplementation(() => ({
+      sendQuery: mockSendQueryDag,
+      getType: () => 'claude',
+      getCapabilities: mockClaudeCapabilities,
+    }));
+  });
+
+  afterEach(async () => {
+    await rm(testDir, { recursive: true, force: true });
+  });
+
+  it('forwards only workflow-configured MCP failures', async () => {
+    await writeFile(join(testDir, 'mcp.json'), JSON.stringify({ github: { command: 'npx' } }));
+    mockSendQueryDag.mockImplementation(function* () {
+      yield {
+        type: 'system',
+        content: 'MCP server connection failed: telegram (disconnected), github (timeout)',
+      };
+      yield { type: 'result', sessionId: 'mcp-filter-session' };
+    });
+
+    const platform = createMockPlatform();
+
+    await executeDagWorkflow(
+      createMockDeps(),
+      platform,
+      'conv-mcp-filter',
+      testDir,
+      {
+        name: 'mcp-filter-test',
+        nodes: [{ id: 'review', prompt: 'Review', mcp: 'mcp.json' }],
+      },
+      makeWorkflowRun('mcp-filter-run'),
+      'claude',
+      undefined,
+      join(testDir, 'artifacts'),
+      join(testDir, 'logs'),
+      'main',
+      'docs/',
+      minimalConfig
+    );
+
+    const messages = (platform.sendMessage as ReturnType<typeof mock>).mock.calls.map(
+      (call: unknown[]) => call[1] as string
+    );
+    expect(messages.some(m => m.includes('github'))).toBe(true);
+    expect(messages.some(m => m.includes('telegram'))).toBe(false);
+  });
+});
+
 describe('parseMcpFailureServerNames', () => {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/workflows/src/dag-executor.test.ts` around lines 5761 - 5853, Add an end-to-end unit test for executeDagWorkflow that verifies MCP warning filtering by mocking a provider to emit both MCP server connection failures and plugin-level failures, and asserting executeDagWorkflow forwards only the configured MCP failures (using loadConfiguredMcpServerNames/parseMcpFailureServerNames semantics) while suppressing plugin failures; to implement, add a test that registers a fake provider which produces deterministic log/messages, call executeDagWorkflow with a nodeMcpPath pointing at a temp config created via the existing loadConfiguredMcpServerNames helper, then assert the workflow result/logs contain the MCP server names parsed by parseMcpFailureServerNames and do not contain plugin failure entries.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0509c545-b54b-4669-81b4-6166ac95a096
📒 Files selected for processing (3)
- CHANGELOG.md
- packages/workflows/src/dag-executor.test.ts
- packages/workflows/src/dag-executor.ts
PR Review Summary (multi-agent)

Six specialist agents reviewed this PR. The core fix (split workflow-configured vs. plugin MCP failures) is sound and well-tested at the helper level. There is one likely-real bug, one observability gap that violates the project's fail-fast rule, and a handful of polish items.

Critical Issues (2)
Important Issues (4)
Suggestions (3)
Strengths
Verdict

NEEDS FIXES — the status-stripping bug (Critical 1) and the silent catch (Critical 2) should both be addressed before merge. The other items are quick polish or doc updates.

Recommended Actions
fix(workflows): preserve MCP failure status in filtered message + observability

Address review feedback on PR #1327:

- parseMcpFailureServerNames now returns {name, segment} entries so the forwarded "MCP server connection failed: ..." message preserves the per-server status detail (e.g. "(timeout)", "(disconnected)") that the bare-name reconstruction was dropping.
- loadConfiguredMcpServerNames now debug-logs read failures as dag.mcp_filter_config_read_failed instead of swallowing them silently. A transient EMFILE/EBUSY at filter time would otherwise silently reclassify a real workflow-MCP failure as plugin noise.
- Add 4 integration tests through executeDagWorkflow covering the mixed workflow/plugin split, all-plugin suppression, no-`mcp:`-config nodes, and the unchanged ⚠️ warning path.
- Drop a WHAT comment above configuredMcpNames and a temporal phrase ("before this filter landed") that would rot on merge.
- Document the filtering boundary in guides/mcp-servers.md and add a troubleshooting row for users debugging silently-suppressed plugin MCPs.
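A minimal sketch of the {name, segment} parser shape this commit describes (an assumed reconstruction, not the actual dag-executor code):

```typescript
// Sketch of a status-preserving parser: each failing server keeps both its
// bare name (for the configured-set lookup) and the full original segment
// (for the forwarded message and debug log). Identifiers are modeled on the
// commit message, not copied from the repository.
const MCP_FAILURE_PREFIX = 'MCP server connection failed: ';

interface McpFailureEntry {
  name: string;    // server name, e.g. "github"
  segment: string; // full original segment, e.g. "github (timeout)"
}

function parseMcpFailureServerNames(content: string): McpFailureEntry[] {
  if (!content.startsWith(MCP_FAILURE_PREFIX)) return [];
  // Dedupe by name, keeping the first segment seen, in input order.
  const byName = new Map<string, string>();
  for (const raw of content.slice(MCP_FAILURE_PREFIX.length).split(', ')) {
    const segment = raw.trim();
    const name = segment.split(' (')[0].trim();
    if (name && !byName.has(name)) byName.set(name, segment);
  }
  return [...byName].map(([name, segment]) => ({ name, segment }));
}
```

Because the filter can then join `segment` values instead of bare names, the "(timeout)" / "(disconnected)" detail survives into the forwarded warning.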
Pushed 7a32dde addressing the multi-agent review:
Skipped as YAGNI:
…h 7 (#12)

* fix(workflows): make archon-adversarial-dev sed replacement macOS-safe (coleam00#1155)

  * fix(workflows): make adversarial init sed portable on macOS
  * chore: regenerate bundled-defaults after adversarial-dev sed fix

  Sync generated bundle with the new temp-file sed pattern in archon-adversarial-dev.yaml so check:bundled passes and binary distributions ship the macOS-safe version.

  ---------

  Co-authored-by: laplace young <yangqk12@whu.edu.cn>
  Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>

* fix(workflows): filter user-plugin MCP noise out of workflow warnings (coleam00#1327)

  * fix(workflows): filter user-plugin MCP noise out of workflow warnings

  Before this change, the dag-executor surfaced every entry in the Claude SDK's "MCP server connection failed: …" system message to the user. That message includes user-level plugin MCPs inherited from ~/.claude/ (e.g. `telegram`) that fail to connect in the headless workflow subprocess — they're non-actionable noise for the workflow author.

  Fix:
  - Pre-compute the set of workflow-configured MCP server names per node by parsing the `mcp:` config file once at the start of executeNodeInternal. No caller-facing API change; no duplication of the provider's env-var expansion logic (we only need the keys).
  - Split the system-message handler: the `MCP server connection failed:` path now surfaces only the subset of failing names that match the node's configured set; user-plugin failures are debug-logged as `dag.mcp_plugin_connection_suppressed`. The `⚠️ ` branch is unchanged.

  Supersedes coleam00#1134 (closed as stale — the Windows HOME fix in that PR was already shipped via coleam00#1302, and the claude.ts enabledPlugins change targeted a file that has since moved into @archon/providers).

  Credits @MrFadiAi for identifying and reporting the underlying issue.

  Co-authored-by: Fadi Ai <MrFadiAi@users.noreply.github.com>

  * fix(workflows): preserve MCP failure status in filtered message + observability

  Address review feedback on PR coleam00#1327:
  - parseMcpFailureServerNames now returns {name, segment} entries so the forwarded "MCP server connection failed: ..." message preserves the per-server status detail (e.g. "(timeout)", "(disconnected)") that the bare-name reconstruction was dropping.
  - loadConfiguredMcpServerNames now debug-logs read failures as dag.mcp_filter_config_read_failed instead of swallowing them silently. A transient EMFILE/EBUSY at filter time would otherwise silently reclassify a real workflow-MCP failure as plugin noise.
  - Add 4 integration tests through executeDagWorkflow covering the mixed workflow/plugin split, all-plugin suppression, no-`mcp:`-config nodes, and the unchanged ⚠️ warning path.
  - Drop a WHAT comment above configuredMcpNames and a temporal phrase ("before this filter landed") that would rot on merge.
  - Document the filtering boundary in guides/mcp-servers.md and add a troubleshooting row for users debugging silently-suppressed plugin MCPs.

  ---------

  Co-authored-by: Fadi Ai <MrFadiAi@users.noreply.github.com>

* test(workflows): add anyFailed status derivation coverage for DAG executor (coleam00#1403)

  PIV Task 1: Adds three new tests in a dedicated describe block 'executeDagWorkflow -- final status derivation' covering the anyFailed branch (dag-executor.ts ~line 2956) that previously had no direct test:
  - one success + one independent failure calls failWorkflowRun (not completeWorkflowRun)
  - multiple successes + one failure calls failWorkflowRun (not completeWorkflowRun)
  - trigger_rule: none_failed skips dependent node but anyFailed still marks run failed

  Fixes coleam00#1381.

* fix(workflow): migrate piv-loop plan handoff to $ARTIFACTS_DIR (coleam00#1398)

  * fix(workflow): migrate piv-loop plan handoff to $ARTIFACTS_DIR (coleam00#1380)

  The create-plan node used a relative path (.claude/archon/plans/{slug}.plan.md) that the AI agent would sometimes write to a different location, breaking all downstream nodes that glob for the plan file. Migrated all plan/progress file references to $ARTIFACTS_DIR/plan.md and $ARTIFACTS_DIR/progress.txt, matching the pattern used by archon-fix-github-issue and other workflows.

  Changes:
  - Replace slug-based plan path with $ARTIFACTS_DIR/plan.md in create-plan node
  - Replace ls -t glob discovery with direct $ARTIFACTS_DIR/plan.md reads in refine-plan, code-review, and fix-feedback nodes
  - Replace empty-string guard with file-existence check in implement-setup bash
  - Migrate progress.txt references in implement loop to $ARTIFACTS_DIR/
  - Add explicit plan/progress paths in finalize node
  - Regenerated bundled-defaults.generated.ts

  Fixes coleam00#1380

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

  * fix(workflow): address review findings in archon-piv-loop

  - Rename 'Step 2: Write the Plan' to 'Step 2: Plan File Location' to eliminate the duplicate heading that collided with Step 3's identical title in the create-plan node
  - Guard implement-setup against a 0-task plan file: exit 1 with a clear error when no '### Task N:' sections are found, preventing a silent no-op implement loop
  - Remove 2>/dev/null from code-review commit so pre-commit hook failures and other stderr are visible to the agent instead of silently swallowed
  - Replace '|| true' on git push in finalize with an explicit WARNING echo so push failures (auth, upstream conflict, no remote) surface to the agent rather than being silently ignored
  - Regenerate bundled-defaults.generated.ts

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

  * chore(workflows): regenerate bundled defaults to match opus[1m] alias

  The bundle was stale relative to the YAML sources after coleam00#1395 merged — check:bundled was failing CI. Regenerated; no YAML edits.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

  ---------

  Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore(workflows): switch default Opus pin to opus[1m] alias (coleam00#1395)

  Anthropic's Opus 4.7 landed 2026-04-16; on the Anthropic API, opus / opus[1m] now resolve to 4.7 with a 1M context window at standard pricing. Using the alias instead of the hard-pinned claude-opus-4-6[1m] lets bundled default workflows auto-track the recommended Opus version. No explicit effort is set, so nodes inherit the per-model default (xhigh on 4.7, high on 4.6).

* fix(workflows): approval gate bypass after reject-with-redraft on resume (coleam00#1435)

  * fix(workflows): approval gate bypass after reject-with-redraft on resume

  When an approval node was rejected with on_reject.prompt, the synthetic PromptNode built to run the on_reject prompt reused the approval gate's own node ID. executeNodeInternal then wrote a node_completed event with that ID, causing getCompletedDagNodeOutputs to treat the gate as already completed on the next resume — bypassing the human gate entirely.

  Fix: give the synthetic node the ID `${node.id}:on_reject` so its node_completed event has a distinct step_name that won't match the approval gate slot in priorCompletedNodes. Adds a regression test asserting no node_completed event with the approval gate's ID is written during on_reject execution.

  Fixes coleam00#1429

  * test(workflows): add positive assertion and SSE side-effect comment for on_reject synthetic node

  Add complementary positive assertion to the regression test to verify that node_completed is written exactly once with step_name 'review:on_reject', ensuring future refactors that suppress the event entirely would be caught.

  Add inline comment in executeApprovalNode documenting the known SSE side-effect: node_started/node_completed events with nodeId='review:on_reject' flow through the SSE pipeline into the web UI, resulting in a transient phantom node in the execution view. This is cosmetic-only — the human gate contract is preserved.

  * simplify: reduce duplicate cast pattern in on_reject test assertions

* fix(workflows): concise failure messages for bash/script nodes (coleam00#1389) (coleam00#1393)

  * fix(workflows): concise failure messages for bash/script nodes (coleam00#1389)

  When a `bash:` or `script:` node fails, the executor was embedding `err.message` verbatim into the user-visible error. For inline scripts run via `bash -c <body>` / `bun -e <body>`, Node's `promisify(execFile)` puts the entire substituted script body into `err.message`, `err.cmd`, and the first line of `err.stack` — and the Pino log serialized all three, repeating the body ~3× in one structured log line. The actionable diagnostic was buried under kilobytes of echoed source.

  Route both catch-block default branches through a new `formatSubprocessFailure(err, label)` helper in `executor-shared.ts` that:
  - strips the `Command failed: <cmd>` prefix line (which carries the body)
  - prefers `err.stderr` — Bun/bash emit the actionable diagnostic there
  - tail-truncates at 2 KB with a `…[truncated]` marker
  - returns a controlled `logFields` subset (`exitCode`, `killed`, `stderrTail`) so Pino never re-serializes `err.message` / `err.stack` / `err.cmd`

  Also drops the script-node `stderrHint` concatenation — stderr is already handled by the helper, so the previous code appended it twice. Timeout / ENOENT / EACCES branches are preserved verbatim.

  Fixes coleam00#1389

  * fix(workflows): address PR review feedback for coleam00#1389

  - Run formatSubprocessFailure unconditionally so timeout / ENOENT / EACCES branches also get sanitized log fields (the timeout message also embeds the `Command failed: bash -c <body>` line).
  - Drop `errType: err.constructor.name` (always 'Error' in production) and replace with `nodeType: 'bash' | 'script'` for actual discriminating value.
  - Replace chained `||` + ternary in diagnostic selection with explicit if/else for readability.
  - Simplify exit suffix guard: `err.code != null` instead of double typeof.
  - Make stderrTail emptiness check explicit.
  - Drop `RawSubprocessError` export (no external consumer) and widen `code` to `number | string | null` to mirror Node's ExecFileException.
  - Tighten test length bound to <2100 (was <4000 / <2200) so a doubling of SUBPROCESS_ERROR_MAX_CHARS would actually trip the assertion.
  - Replace broad regex assertion with `.toContain('[eval]')` — the location marker is the strongest signal that diagnostic content survived.
  - Strip issue-number citations from describe block and test comments.
  - Update script-nodes.md to distinguish stderr behavior on success vs failure paths.
  - Add CHANGELOG Unreleased / Fixed entry for the user-visible change.

* chore(workflows): regenerate bundled defaults after Tier 6 picks

* docs: add CHANGELOG entry for Tier 6 workflow polish batch

---------

Co-authored-by: CauchYoung <2024302072042@whu.edu.cn>
Co-authored-by: laplace young <yangqk12@whu.edu.cn>
Co-authored-by: Rasmus Widing <rasmus.widing@gmail.com>
Co-authored-by: Rasmus Widing <152263317+Wirasm@users.noreply.github.com>
Co-authored-by: Fadi Ai <MrFadiAi@users.noreply.github.com>
Co-authored-by: Cole Medin <cole@dynamous.ai>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: cjnprospa <sirhcle.j23@gmail.com>
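The `formatSubprocessFailure` helper described in the bash/script-node commits above might look roughly like this. The constant value, error shape, and message format are assumptions for illustration, not the shipped executor-shared.ts code:

```typescript
// Hedged sketch: prefer stderr over err.message, strip the "Command failed:"
// line that echoes the whole script body, and tail-truncate the diagnostic.
const SUBPROCESS_ERROR_MAX_CHARS = 2048; // assumed value for the "2 KB" cap

interface SubprocessError {
  message: string;
  stderr?: string;
  code?: number | string | null; // mirrors Node's ExecFileException.code
  killed?: boolean;
}

function formatSubprocessFailure(err: SubprocessError, label: string): string {
  let diagnostic: string;
  if (err.stderr !== undefined && err.stderr.trim() !== '') {
    // bash/bun put the actionable diagnostic on stderr.
    diagnostic = err.stderr.trim();
  } else {
    // Drop the first "Command failed: <cmd>" line, which carries the body.
    diagnostic = err.message.split('\n').slice(1).join('\n').trim() || 'unknown error';
  }
  if (diagnostic.length > SUBPROCESS_ERROR_MAX_CHARS) {
    // Keep the tail: errors usually end with the most specific information.
    diagnostic = '…[truncated]' + diagnostic.slice(-SUBPROCESS_ERROR_MAX_CHARS);
  }
  const exitSuffix = err.code != null ? ` (exit code ${err.code})` : '';
  return `${label} failed${exitSuffix}: ${diagnostic}`;
}
```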
Summary
Supersedes #1134 (closed — stale and conflicting).
The Claude SDK surfaces a system message of the form `MCP server connection failed: <name> (<status>), …` whenever any MCP server in the subprocess fails to connect. Until now the dag-executor forwarded the entire message to the conversation. That meant user-level plugin MCPs inherited from `~/.claude/` (e.g. `telegram`, `notion`) — which have no connection to what the workflow author configured and routinely fail inside a headless subprocess — were shown as workflow warnings.

Fix: only surface failures for servers the workflow actually configured via `mcp:`.

Changes
packages/workflows/src/dag-executor.ts

- `parseMcpFailureServerNames(message)` — parses the SDK's formatted failure string into an ordered, deduped name list.
- `loadConfiguredMcpServerNames(nodeMcpPath, cwd)` — reads the node's `mcp:` JSON file and returns `Object.keys()`. Deliberately does not env-expand (the provider owns that and will surface its own parse errors via ⚠️); parse/read failures return an empty set.
- `executeNodeInternal` pre-computes `configuredMcpNames` once at entry.
- The `msg.type === 'system'` handler now has two distinct branches:
  - `MCP server connection failed:` → split by workflow-configured vs. plugin; surface only the workflow-configured subset; debug-log the rest as `dag.mcp_plugin_connection_suppressed`.
  - `⚠️` → unchanged (always surface — these are author-actionable, e.g. missing env vars, Haiku+MCP, invalid structured output).

packages/workflows/src/dag-executor.test.ts — 13 new tests covering both helpers (malformed input, dedup, absolute/relative paths, missing files, invalid JSON, non-object JSON).

CHANGELOG.md — entry under Fixed, credits @MrFadiAi.

Why this approach over #1134
The original PR also tried to pass `settings: { enabledPlugins: {} }` to the Claude SDK to disable user plugins in the subprocess. We deliberately skip that: it would change subprocess behavior globally (not just silence the warning), and the SDK's plugin-disable flag is an ergonomics concession that could mask real issues. Filtering at the display layer keeps all diagnostics available in debug logs while giving workflow authors a clean conversation. If a workflow-configured MCP really fails, the warning still surfaces verbatim.

Parsing the `mcp:` JSON twice (once here for names, once in the provider for full env-expanded server configs) was judged cheaper than changing the provider contract to return the names it loaded. The dag-executor read is cached per node-invocation and tolerant of any file-read failure.

Test plan
- `bun run validate` green locally (type-check + lint + format:check + full test suite)
- `bun test packages/workflows/src/dag-executor.test.ts` — 177 pass / 0 fail
- New unit tests for `parseMcpFailureServerNames` + `loadConfiguredMcpServerNames`

Co-authored-by: Fadi Ai <MrFadiAi@users.noreply.github.com>
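For reference, the `loadConfiguredMcpServerNames` behavior described under Changes (keys only, no env expansion, empty set on any read/parse failure) can be sketched as follows; path resolution and logging here are simplified assumptions, not the shipped implementation:

```typescript
import { readFileSync } from 'node:fs';
import { isAbsolute, join } from 'node:path';

// Read the node's mcp: JSON config and return its top-level server names.
// Tolerant by design: a missing file, invalid JSON, or non-object root all
// yield an empty set rather than failing the node.
function loadConfiguredMcpServerNames(nodeMcpPath: string, cwd: string): Set<string> {
  try {
    const resolved = isAbsolute(nodeMcpPath) ? nodeMcpPath : join(cwd, nodeMcpPath);
    const parsed: unknown = JSON.parse(readFileSync(resolved, 'utf8'));
    // Deliberately no env-var expansion: only the keys are needed here.
    if (parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed)) {
      return new Set(Object.keys(parsed));
    }
  } catch {
    // Missing file or invalid JSON: fall through to the empty set.
  }
  return new Set();
}
```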
Summary by CodeRabbit
Bug Fixes
Tests
Documentation